Skip to content

Conversation

@dlech
Copy link
Member

@dlech dlech commented Jan 4, 2026

Add a number of hacks to work around issues seen when flashing firmware on the EV3.

See individual commit messages for details.

Alternate to #2352.

The first patch on it's own should allow firmware flashing to complete without error (only warnings).

The rest of the patches aim to avoid triggering the warnings in the first place.

// 384 bytes (triggered by 65536 % 1018 = 384). To work around this, we
// always erase two sectors to make the last chunk be twice as big
// (131072 % 1018 = 768).
const eraseSize = sectorSize * 2; // flash memory sector size
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const eraseSize = sectorSize * 2; // flash memory sector size
const eraseSize = sectorSize * 2;

@jaguilar
Copy link
Contributor

jaguilar commented Jan 5, 2026

Affirmed that with 5ed3861, I can successfully flash the EV3 from my Windows computer.

@laurensvalk
Copy link
Member

laurensvalk commented Jan 5, 2026

Thanks! This seems to be working, although I'm seeing a separate HID detection issue. I'll defer that to a separate issue.

I noticed that the progress bar doesn't appear until after a few seconds when it hits 8%, so for a little while you're left wondering if anything happened. It would be nice to start at 0%.

When I put in a dedicated progress at the start, there seems to be a separate bug that if progress is 0 it won't show the percentage. It just shows progress:

@laurensvalk
Copy link
Member

function computeEv3ExtraLength(firmwareLength: number): number {
    // HACK: If the EV3 firmware size is not a multiple of 2 * 64KiB, then we
    // need to add some padding to avoid possibly triggering a bug where small
    // download payloads can cause bad replies. This is done by ensuring that
    // the last chunk is 1018 bytes.

    const maxChunkSize = 1018;
    const eraseSize = 2 * 64 * 1024;
    const remainder = firmwareLength % eraseSize;

    if (remainder === 0) {
        // Adding padding would cause an entire extra erase to be needed. Don't
        // do that and rely on a a different workaround (always erasing two sectors).
        return 0;
    }

    return maxChunkSize - (remainder % maxChunkSize);
}

Would things be a lot simpler if we write in chunks of 512 bytes?

@dlech
Copy link
Member Author

dlech commented Jan 5, 2026

I noticed that the progress bar doesn't appear until after a few seconds when it hits 8%, so for a little while you're left wondering if anything happened. It would be nice to start at 0%.

Yes, we never had such a slow first step on other hubs, so could use some work.

Would things be a lot simpler if we write in chunks of 512 bytes?

I tried that, but we get no response from the EV3 if the chunk size is anything other than 1018 (other than the last chunk).

@laurensvalk
Copy link
Member

Ah, makes sense now. Thanks!

@laurensvalk
Copy link
Member

Since this is titled EV3 USB hacks, I've added two more. I propose to include this PR in the next beta/labs build to see where things stand. We can restore the USB reset if we find that we need to.

dlech added 4 commits January 23, 2026 14:21
Add a hack to not fail if the EV3 bootloader sends an echo of the request
instead of the expected response. There is a known compatibility issue
with the EV3 bootloader USB and, e.g. Windows with USB 3.0 ports. This
apparently causes a race condition where commands that don't take long
to process before sending a response will have the request echoed back
instead of receiving the actual response. If this happens, we can
ignore it and assume the command was successful. It just won't work, e.g.
for the version command since that has a payload in the response.
Change from erasing 1 sector at a time to erasing 2 sectors at a time.
This works around a USB issue where commands that don't take long to
execute can receive an incorrect response.

The point of erasing 2 sectors at a time is that when we write the data
to the sectors that we just erased, the last (partial) chunk of data is
now twice as large, which makes it take twice as long to write. This
seems to be long enough to avoid the USB issue.
Add an extra progress between erasing and writing. This makes the
progress not stall for quite as long before progressing when flashing
smaller firmwares.
Add a function to pad EV3 firmware size to avoid potentially triggering
bugs in the EV3 bootloader. When we do a download command with too small
of a payload, the bootloader can send a bad response. We can work around
this by ensuring that the firmware size is such that the last chunk is
the maximum size. This way we never send a small chunk that could
trigger the bug.
@laurensvalk laurensvalk force-pushed the dlech branch 2 times, most recently from f751d1f to fcb0286 Compare January 23, 2026 13:32
src/usb/sagas.ts Outdated
if (hotPlugDevice !== undefined) {
yield* put(usbHotPlugConnectPybricks());

// Even though we get the connect event, the browser does not
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this only apply to Linux?

We should test on a slow machine like a Raspberry Pi to make sure the delay is long enough.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And we should report this as a bug to Chromium. Then it could be fixed in a reliable way. For example, it could be that they are using the udev add event instead of bind that triggers the event before udev rules have run instead of after.

Copy link
Member Author

@dlech dlech Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to drop the delay commit since I think we an do it in a more robust way (assuming there is a specific error we can handle). What is the actual error that you are getting?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently I already answered my own question last October. pybricks/support#2372 (comment)


exitStack.push(() => usbDevice.close().catch(console.debug));

// Always reset the USB device to ensure it is in a known state.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we already have a good idea of what the solution will be to replace this, I think we should go ahead and implement it now instead of doing a buggy release.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't currently know what the solution is, but I'm happy to wait if you think we have one.

I've not been able to reproduce any issues without the reset so far.

src/usb/sagas.ts Outdated
// if there is exactly one Pybricks device connected, we can connect to
// it, otherwise we should let the user choose which one to connect to
if (pybricksDevices.length === 1) {
// REVISIT: Device is passed as hotplug device, even though it is
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what needs to be revisited here. Is the problem just the technicality in the naming?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. This is not a hotplug. This gets called when you load the page and the hub is already connected. So we don't need the hotplug delay (but it doesn't hurt to keep).

Copy link
Member Author

@dlech dlech Jan 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be (very unlikely but still possible) that the USB was plugged in at the same time the page was loaded, so probably best to keep it and not differentiate.

This operation does not work on Windows, and does not appear to be
crucial on other platforms either. The browser seems to already handle
resetting the endpoints as needed when connecting or closing the browser
(at least on Linux)
@dlech dlech enabled auto-merge (rebase) January 24, 2026 22:12
@dlech
Copy link
Member Author

dlech commented Jan 24, 2026

I know we still need the delay, but I think there is some room for improvement on that patch, so I'm going to go ahead and merge this and we can start a new PR for the delay.

@dlech dlech merged commit 3abc3f3 into master Jan 24, 2026
2 checks passed
@dlech dlech deleted the dlech branch January 24, 2026 22:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants